Audio Properties

This section discusses the sound capabilities of QuickTime and the Movie Toolbox. It has been divided into the following topics:

Sound Playback

As is the case with video data, QuickTime movies store sound information in tracks. QuickTime movies may have one or more sound tracks. The Movie Toolbox can play more than one sound at a time by mixing the enabled sound tracks together during playback. This allows you to put together movies with separate music and voice tracks. You can then manipulate the tracks separately but play them together. You can also use multiple sound tracks to store different languages.

There are two main attributes of sound in QuickTime movies: volume and balance. You can control these attributes using the facilities of the Movie Toolbox.

Every QuickTime movie has a current volume setting. This volume setting controls the loudness of the movie's sound. You can adjust a movie's current volume by calling the SetMovieVolume function. In addition, you can set a preferred volume setting for a movie. This value represents the best volume for the movie. The Movie Toolbox saves this value when you store a movie into a movie file. The value of the current volume is lost. You can set a movie's preferred volume by calling the SetMoviePreferredVolume function. When you load a movie from a movie file, the Movie Toolbox sets the movie's current volume to the value of its preferred volume.

Each track in a movie also has a volume setting. A track's volume governs its loudness relative to other tracks in the movie. You can set a track's volume by calling the SetTrackVolume function.

In the Movie Toolbox, movie and track volumes are represented as 16-bit, fixed-point numbers that range from -1.0 to +1.0. The high-order 8 bits contain the integer portion of the value; the low-order 8 bits contain the fractional part. Positive values denote volume settings, with 1.0 corresponding to the maximum volume on your computer. Negative values are muted, but retain the magnitude of the volume setting so that, by toggling the sign of a volume setting, you can turn off the sound and then turn it back on at the previous level (something like pressing the mute button on a radio).

A track's volume is scaled to a movie's volume, and the movie's volume is scaled to the value the user specifies for speaker volume using the Sound control panel. That is, a movie's volume setting represents the maximum loudness of any track in the movie. If you set a track's volume to a value less than 1.0, that track plays proportionally quieter, relative to the loudness of other tracks in the movie.

Each track in a movie has its own balance setting. The balance setting controls the mix of sound between a computer's two speakers. If the source sound is monaural, the balance setting controls the relative loudness of each speaker. If the source sound is stereo, the balance setting governs the mix of the right and left channels. You can set the balance for a track's media by calling the SetSoundMediaBalance function. When you save the movie, the balance setting is stored in the movie file.

In the Movie Toolbox, balance values are represented as 16-bit, fixed-point numbers that range from -1.0 to +1.0. The high-order 8 bits contain the integer portion of the value; the low-order 8 bits contain the fractional part. Negative values weight the balance toward the left speaker; positive values emphasize the left channel. Setting the balance to 0 corresponds to a neutral setting.

Adding Sound to Video

Most QuickTime movies contain both sound data and video data. If you are creating an application that plays movies, you do not need to worry about the details of how sound is stored in a movie. However, if you are developing an application that creates movies, you need to consider how you store the sound and video data.

There are two ways to store sound data in a QuickTime movie. The simplest method is to store the sound track as a continuous stream. When you play a movie that has its sound in this form, the Movie Toolbox loads the entire sound track into memory, and then reads the video frames when they are needed for display. While this technique is very efficient, it requires a large amount of memory to store the entire sound, which limits the length of the movie. This technique also requires a large amount of time to read in the entire sound track before the movie can start playing. For this reason, this technique is only recommended when the sound for a movie is fairly small (less than 64 KB).

For larger movies, a technique called interleaving must be used so that the sound and video data may be alternated in small pieces, and the data can be read off disk as it is needed. Interleaving allows for movies of almost any length with little delay on startup. However, you must tune the storage parameters to avoid a lower video frame rate and breaks in the sound that result when sound data is read from slow storage devices. In general, the Movie Toolbox hides the details of interleaving from your application. The FlattenMovie and FlattenMovieData functions allow you to enable and disable interleaving when you create a movie. These functions then interact with the appropriate media handler to correctly interleave the sound and video data for your movie. For more information about working with sound, see the chapter "Sound Manager" in Inside Macintosh: More Macintosh Toolbox .

Sound Data Formats

The Movie Toolbox stores sound data in sound tracks as a series of digital samples. Each sample specifies the amplitude of the sound at a given point in time, a format commonly known as linear pulse-code modulation (linear PCM). The Movie Toolbox supports both monaural and stereo sound. For monaural sounds, the samples are stored sequentially, one after another. For stereo sounds, the samples are stored interleaved in a left/right/ left/right fashion.

In order to support a broad range of audio data formats, the Movie Toolbox can accommodate a number of different sample encoding formats, sample sizes, sample rates, and compression algorithms. The following paragraphs discuss the details of each of these attributes of movie sound data.

The Movie Toolbox supports two techniques for encoding the amplitude values in a sample: offset-binary and twos-complement. Offset-binary encoding represents the range of amplitude values as an unsigned number, with the midpoint of the range representing silence. For example, an 8-bit sample stored in offset-binary format would contain sample values ranging from 0 to 255, with a value of 128 specifying silence (no amplitude). Samples in Macintosh sound resources are stored in offset-binary form.

Twos-complement encoding stores the amplitude values as a signed number--in this case silence is represented by a sample value of 0. Using the same 8-bit example, twos-complement values would range from -128 to 127, with 0 meaning silence. The Audio Interchange File Format (AIFF) used by the Sound Manager stores samples in twos-complement form, so it is common to see this type of sound in QuickTime movies.

The Movie Toolbox allows you to store information about the sound data in the sound description. See "The Sound Description Structure," for details on the sound description structure. Sample size indicates the number of bits used to encode the amplitude value for each sample. The size of a sample determines the quality of the sound, since more bits can represent more amplitude values. The basic Macintosh sound hardware supports only 8-bit samples, but the Sound Manager also supports 16-bit and 32-bit sample sizes. The Movie Toolbox plays these larger samples on 8-bit Macintosh hardware by converting the samples to 8-bit format before playing them.

Sample rate indicates the number of samples captured per second. The sample rate also influences the sound quality, because higher rates can more accurately capture the original sound waveform. The basic Macintosh hardware supports an output sampling rate of 22.254 kHz. The Movie Toolbox can support any rate up to 65.535 kHz; as with sample size, the Movie Toolbox converts higher sample rates to rates that can be accommodated by the Macintosh hardware when it plays the sound.

In addition to these sample encoding formats, the Movie Toolbox also supports the Macintosh Audio Compression and Expansion ( MACE) capability of the Sound Manager. This allows compression of the sound data at ratios of 3 to 1 or 6 to 1. Compressing a movie's sound can yield significant savings in storage and RAM space, at the cost of somewhat lower quality and higher CPU overhead on playback.